fix: 3 critical bugs + routing prefix strip #134

Closed
dpbmaverick98 wants to merge 2 commits into TinyAGI:main from dpbmaverick98:main

@dpbmaverick98

Summary

  • Bug 1 — Channel response drops: Send-then-ack flow in channel clients was non-atomic, causing random response loss after the SQLite migration. Added delivering status to responses table, claim/unclaim API endpoints, and claim-before-send pattern in Telegram client with retry tracking (max 3 attempts) and periodic recovery for stuck deliveries.

  • Bug 2 — Inter-agent mention failures: Teammate mentions ([@agent: msg]) were silently dropped due to case sensitivity, missing validation logging, and a weak regex. Added case-insensitive agent ID lookup, detailed failure logging in isTeammate() and extractTeammateMentions(), improved regex for bracket handling, and a validateAgentResponse() helper.

  • Bug 3 — Multi-agent reply loss (race condition): When multiple agents completed simultaneously, concurrent conv.pending-- operations caused the conversation to never complete. Added withConversationLock() promise-chain mutex, safe incrementPending/decrementPending operations, and automatic conversation state recovery.

  • Bug 4 — @agent routing broken by message prefix: The messages API route prepends [channel/sender]: to incoming messages, which caused parseAgentRouting() to fail since messages no longer start with @agent_id. All messages fell back to the first configured agent. Fixed by stripping the prefix before parsing.
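
The prefix stripping described for Bug 4 could be sketched roughly as follows. This is a minimal illustration, not the code in src/lib/routing.ts: the stripChannelPrefix() helper, the Routing type, the defaultAgentId parameter, and both regular expressions are assumptions. Only the [channel/sender]: prefix shape and the parseAgentRouting() name come from the description above.

```typescript
// Hypothetical result shape; the real return type is not shown in this PR page.
interface Routing {
  agentId: string;
  message: string;
}

// Remove a leading "[channel/sender]: " prefix if one is present.
function stripChannelPrefix(raw: string): string {
  return raw.replace(/^\[[^\]\/]+\/[^\]]+\]:\s*/, "");
}

// Route to the agent named by a leading @mention, else to the default agent.
function parseAgentRouting(raw: string, defaultAgentId: string): Routing {
  const text = stripChannelPrefix(raw);
  const match = /^@([\w-]+)(?:\s+([\s\S]*))?$/.exec(text);
  if (!match) {
    return { agentId: defaultAgentId, message: text };
  }
  return { agentId: match[1], message: match[2] ?? "" };
}

const routed = parseAgentRouting("[telegram/alice]: @coder fix the build", "assistant");
console.log(routed.agentId, "<-", routed.message);
```

Whether the agent should still see the sender prefix after routing is a separate decision that this sketch leaves out.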

Files Changed

| File | Changes |
| --- | --- |
| src/lib/db.ts | delivering status, claimResponseForDelivery(), unclaimResponse(), recoverStuckDeliveringResponses() |
| src/server/routes/queue.ts | /api/responses/:id/claim and /unclaim endpoints |
| src/lib/conversation.ts | withConversationLock(), incrementPending(), decrementPending(), state validation + recovery |
| src/queue-processor.ts | Atomic pending counter ops, periodic stuck delivery recovery |
| src/lib/routing.ts | Case-insensitive matching, detailed logging, improved regex, prefix stripping in parseAgentRouting() |
| src/channels/telegram-client.ts | Claim-before-send, retry tracking, stale delivery cleanup |
| docs/bug-fixes/ | Full bug analysis and solution documentation |
| CLAUDE.md | Claude Code guidance for working in this repo |

Test plan

  • npm run build compiles cleanly
  • ./tinyclaw.sh start launches queue processor + API server successfully
  • /api/queue/status returns correct stats
  • /api/responses/:id/claim and /unclaim endpoints respond correctly
  • @agent_id routing works correctly from Telegram (no longer falls back to first agent)
  • @team_id routing activates the team leader
  • Tested locally with Telegram channel connected

🤖 Generated with Claude Code

Devain Pal Bansal and others added 2 commits February 23, 2026 13:03
…s, multi-agent race condition

Bug 1 (Response Drops): Channel clients had non-atomic send-then-ack flow
causing random response loss. Added 'delivering' status to SQLite responses
table, claim/unclaim API endpoints, and claim-before-send pattern in Telegram
client with retry tracking (max 3 attempts).
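
As a rough illustration of the claim-before-send flow, the sketch below uses an in-memory Map in place of the SQLite responses table. The method names follow the ones listed for src/lib/db.ts, but their signatures, the 'pending', 'acked' and 'failed' statuses, the ack() method, and the deliver() helper are all assumptions made for this example.

```typescript
// Hypothetical stand-in for a row in the responses table.
type Status = "pending" | "delivering" | "acked" | "failed";

interface ResponseRow {
  id: number;
  status: Status;
  attempts: number;
  claimedAt: number | null;
}

const MAX_ATTEMPTS = 3; // "max 3 attempts" per the commit message

class ResponseStore {
  private rows = new Map<number, ResponseRow>();

  add(id: number): void {
    this.rows.set(id, { id, status: "pending", attempts: 0, claimedAt: null });
  }

  get(id: number): ResponseRow | undefined {
    return this.rows.get(id);
  }

  // A claim succeeds only for a pending row, so two clients cannot both send it.
  claimResponseForDelivery(id: number, now: number): boolean {
    const row = this.rows.get(id);
    if (!row || row.status !== "pending") return false;
    row.status = "delivering";
    row.attempts += 1;
    row.claimedAt = now;
    return true;
  }

  // Release a claim after a failed send; give up once MAX_ATTEMPTS is reached.
  unclaimResponse(id: number): void {
    const row = this.rows.get(id);
    if (!row || row.status !== "delivering") return;
    row.status = row.attempts >= MAX_ATTEMPTS ? "failed" : "pending";
    row.claimedAt = null;
  }

  ack(id: number): void {
    const row = this.rows.get(id);
    if (row && row.status === "delivering") row.status = "acked";
  }

  // Periodic recovery: release claims that have been held longer than staleMs.
  recoverStuckDeliveringResponses(now: number, staleMs: number): number {
    let recovered = 0;
    this.rows.forEach((row) => {
      const stale = row.claimedAt !== null && now - row.claimedAt > staleMs;
      if (row.status === "delivering" && stale) {
        this.unclaimResponse(row.id);
        recovered += 1;
      }
    });
    return recovered;
  }
}

// Claim first, then send, then either ack or release the claim.
function deliver(store: ResponseStore, id: number, send: () => boolean, now: number): boolean {
  if (!store.claimResponseForDelivery(id, now)) return false;
  let ok = false;
  try {
    ok = send();
  } catch {
    ok = false;
  }
  if (ok) store.ack(id);
  else store.unclaimResponse(id);
  return ok;
}
```

In a SQLite-backed version the claim would presumably be a single conditional UPDATE so that the status check and the status change happen atomically; that detail is not shown on this page.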

Bug 2 (Inter-Agent Mention Failures): Teammate mentions were silently dropped
due to case sensitivity, typos, and missing validation logging. Added
case-insensitive agent ID lookup, detailed logging in isTeammate() and
extractTeammateMentions(), improved regex for bracket handling, and
validateAgentResponse() helper.
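
A minimal sketch of case-insensitive mention extraction with failure logging is shown below. It assumes the [@agent: msg] form from the PR summary; the regular expression, the Mention type, the resolveAgentId() helper, and the log parameter are illustrative guesses rather than the code in src/lib/routing.ts, and nested brackets inside a message are not handled.

```typescript
// Hypothetical shape for one extracted mention.
interface Mention {
  agentId: string;
  message: string;
}

// Resolve a mentioned ID to its configured spelling, ignoring case.
function resolveAgentId(mentioned: string, teammates: string[]): string | undefined {
  const wanted = mentioned.toLowerCase();
  return teammates.find((id) => id.toLowerCase() === wanted);
}

// Collect [@agent: msg] mentions, logging every mention that gets dropped.
function extractTeammateMentions(
  text: string,
  teammates: string[],
  log: (msg: string) => void = console.warn
): Mention[] {
  const pattern = /\[@([\w-]+):\s*([^\]]*)\]/g;
  const mentions: Mention[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const agentId = resolveAgentId(match[1], teammates);
    if (agentId === undefined) {
      log(`Dropped mention: "@${match[1]}" is not a teammate (known: ${teammates.join(", ")})`);
      continue;
    }
    mentions.push({ agentId, message: match[2].trim() });
  }
  return mentions;
}
```

Logging the list of known teammates next to the rejected ID makes a typo in an agent name visible instead of silently losing the message.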

Bug 3 (Multi-Agent Reply Loss): Race condition on conv.pending counter when
multiple agents complete simultaneously caused replies to never arrive. Added
withConversationLock() promise-chain mutex, safe incrementPending/decrementPending
operations, and automatic conversation state recovery.
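
One way to build a promise-chain mutex like the one named above is sketched here. The Conversation shape, the lockTails map, and the simulated async gap between reading and writing the counter are assumptions for illustration; the actual implementation in src/lib/conversation.ts is not shown on this page.

```typescript
// Hypothetical minimal conversation record.
interface Conversation {
  id: string;
  pending: number;
}

// One promise chain per conversation ID; each new task is appended to the tail.
const lockTails = new Map<string, Promise<void>>();

function withConversationLock<T>(convId: string, task: () => Promise<T>): Promise<T> {
  const tail = lockTails.get(convId) ?? Promise.resolve();
  const run = tail.then(() => task());
  // The stored tail never rejects, so one failed task does not block later ones.
  const nextTail: Promise<void> = run.then(() => {}, () => {});
  lockTails.set(convId, nextTail);
  // Drop the entry once the chain is idle so the map does not grow forever.
  nextTail.then(() => {
    if (lockTails.get(convId) === nextTail) lockTails.delete(convId);
  });
  return run;
}

// Decrement under the lock; resolves to true when the conversation is complete.
function decrementPending(conv: Conversation): Promise<boolean> {
  return withConversationLock(conv.id, async () => {
    const current = conv.pending;
    await Promise.resolve(); // simulated async gap, e.g. a database write
    conv.pending = Math.max(0, current - 1);
    return conv.pending === 0;
  });
}
```

Because every task for a conversation waits on the previous one, three agents finishing at the same time decrement the counter one after another, and only the last decrement reports completion.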

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The messages API route prepends [channel/sender]: to incoming messages,
which caused parseAgentRouting() regex to fail since the message no
longer starts with @agent_id. This made all messages fall back to the
first agent regardless of @mentions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jlia0
Collaborator

jlia0 commented Feb 23, 2026

Thanks, this is my bad; the last commit was too aggressive. What are your thoughts on the SQLite migration?

@dpbmaverick98
Author

SQLite fixes durability, but the conversations Map is still in-memory—agent handoffs break on restart.

Proposal: Keep SQLite for the queue + add NATS for agent-to-agent coordination (request-reply pattern). When @coder calls @writer, it just waits for a response instead of juggling the Map. SQLite isn't wasted; it becomes our agent memory layer (RAG, history). Bonus: NATS makes adding a Web UI trivial (built-in WebSocket).

This fits TinyClaw's multi-agent vision without the fragility. Worth me implementing the NATS layer as a follow-up?

Also: Is this PR good to merge as-is? It's my first public contribution—happy to address any feedback!
